Maithili Sentence Aligned Speech Corpus (Tirhuta Script)

Owner: Central Institute of Indian Languages

Catalogue Number: 1507

Tags: Maithili; Sentence; Aligned; Speech; Corpus

Dataset Description

41:54:30 (hh:mm:ss) | 26 GB | 21,412 audio segments | 300 speakers

The LDC-IL Maithili Sentence Aligned Speech Corpus (Tirhuta Script) dataset comprises audio files in WAV format, accompanied by a textual layer containing phonetically normalized and orthographically normalized annotations in Tirhuta script. The dataset spans 41:54:30 (hh:mm:ss) of read speech covering continuous text, representative sentences, and date formats. The data comes from 147 female and 153 male native Maithili speakers across diverse age groups and regions. A comprehensive explanation of the dataset can be found in the LDC-IL Maithili Sentence Aligned Speech Corpus (Tirhuta Script) Documentation.
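The headline figures above can be combined into a few rough per-segment and per-speaker averages. The following is a minimal sketch that uses only the numbers stated in this listing. The function and variable names are illustrative and are not part of the corpus or its documentation. The averages assume the recording time is spread evenly across segments and speakers, which the listing does not confirm.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string to a total number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures quoted in the dataset description above.
TOTAL_DURATION = "41:54:30"   # hh:mm:ss
AUDIO_SEGMENTS = 21_412
SPEAKERS = 300

total_seconds = hms_to_seconds(TOTAL_DURATION)

# Rough averages only: an even distribution is an assumption.
avg_segment_seconds = total_seconds / AUDIO_SEGMENTS
avg_segments_per_speaker = AUDIO_SEGMENTS / SPEAKERS
avg_minutes_per_speaker = total_seconds / SPEAKERS / 60

print(f"Total duration: {total_seconds} s")
print(f"Average segment length: {avg_segment_seconds:.2f} s")
print(f"Average segments per speaker: {avg_segments_per_speaker:.1f}")
print(f"Average audio per speaker: {avg_minutes_per_speaker:.1f} min")
```

These averages are only an orientation aid. The actual distribution of segment lengths and per-speaker recording time is described in the corpus documentation.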

For research-based use, please cite the following:

Dinesh Mishra, Shantanu Kumar, Dr. Narayan Kumar Choudhary, Rajesha N., Prof. Shailendra Mohan. Maithili Sentence Aligned Speech Corpus (Tirhuta Script). Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-51-4
Rejitha K. S. and Narayan Kumar Choudhary (eds.). 2025. LDC-IL Corpus Insights. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-33-0

Item specifics

Authors: Dinesh Mishra, Shantanu Kumar, Dr. Narayan Kumar Choudhary, Rajesha N., Prof. Shailendra Mohan
Corpus Type: Maithili Sentence Aligned Speech Corpus
Catalogue Number: 1507
ISBN: 978-93-48633-51-4
Data Source: On Field
Duration: 41:54:30 (hh:mm:ss)
Number of Audio Segments: 21,412
Release Date: 2025/03/20
Terms and Conditions: General instructions for use of the resources provided by LDC-IL.
